To address the lack of labeled data in low-resource languages, which prevents mature deep learning methods from being applied to Named Entity Recognition (NER), a cross-lingual NER model based on a sentence-level Generative Adversarial Network (GAN), namely SLGAN-XLM-R (Sentence-Level GAN based on XLM-R), was proposed. Firstly, the labeled data of the source language were used to train the NER model on the basis of the pre-trained model XLM-R (XLM-RoBERTa); at the same time, language adversarial training was performed on the embedding layer of the XLM-R model by combining the unlabeled data of the target language. Then, the soft labels of the unlabeled data of the target language were predicted by the NER model. Finally, the labeled data of the source language and the target language were mixed to fine-tune the model again to obtain the final NER model. Experiments were conducted on four languages, English, German, Spanish, and Dutch, in the CoNLL-2002 and CoNLL-2003 datasets. The results show that with English as the source language, the F1 scores of the SLGAN-XLM-R model on the test sets of German, Spanish, and Dutch are 72.70%, 79.42%, and 80.03%, respectively, which are 5.38, 5.38, and 3.05 percentage points higher than those of direct fine-tuning on the XLM-R model.
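The self-training stage of the pipeline above can be sketched as follows. This is a minimal illustration, not the authors' code: the confidence threshold and the helper names are assumptions, and the soft labels are assumed to arrive as a per-token probability matrix.

```python
import numpy as np

def select_soft_labels(probs, threshold=0.9):
    """Keep target-language tokens whose predicted soft label is confident
    enough; probs has shape (n_tokens, n_classes)."""
    conf = probs.max(axis=1)          # confidence of the most likely class
    labels = probs.argmax(axis=1)     # hard label derived from the soft label
    mask = conf >= threshold
    return labels[mask], mask

def mix_training_data(source_pairs, target_tokens, target_probs, threshold=0.9):
    """Mix gold source-language (token, label) pairs with confident
    pseudo-labeled target-language pairs for the second fine-tuning pass."""
    labels, mask = select_soft_labels(target_probs, threshold)
    pseudo_pairs = [(tok, int(lab))
                    for tok, lab in zip(np.asarray(target_tokens)[mask], labels)]
    return list(source_pairs) + pseudo_pairs
```

Filtering by confidence before mixing is a common way to keep noisy pseudo-labels from dominating the second fine-tuning round; the paper may mix all soft labels instead.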
Three-way concept analysis is an important topic in the field of artificial intelligence. The biggest advantage of this theory is that it can simultaneously study the attributes that the objects in a formal context "commonly possess" and "commonly do not possess". It is well known that the new formal context generated by attribute clustering has a strong connection with the original formal context, and there is a close internal connection between the original three-way concepts and the new three-way concepts obtained by attribute clustering. Therefore, a comparative study and analysis of three-way concepts under attribute clustering was carried out. Firstly, the concepts of pessimistic, optimistic and general attribute clustering were proposed on the basis of attribute clustering, and the relationships among these three concepts were studied. Moreover, the difference between the original three-way concepts and the new ones was studied by comparing the clustering process with the formation process of three-way concepts. Furthermore, two minimum constraint indexes were put forward from the object-oriented and attribute-oriented perspectives respectively, and the influence of attribute clustering on the three-way concept lattice was explored. The above results further enrich the theory of three-way concept analysis and provide feasible ideas for the field of visual data processing.
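The two derivation operators at the heart of three-way concept analysis can be sketched in a few lines. This is a generic illustration of object-induced three-way concepts, not the paper's notation; the formal context is modeled as a dict mapping each object to its attribute set.

```python
def pos_derivation(objects, context, attributes):
    """Attributes commonly possessed by every object in the set."""
    result = set(attributes)
    for obj in objects:
        result &= context[obj]
    return result

def neg_derivation(objects, context, attributes):
    """Attributes commonly NOT possessed by any object in the set."""
    result = set(attributes)
    for obj in objects:
        result &= attributes - context[obj]
    return result

def three_way_operator(objects, context, attributes):
    """Object-induced three-way operator: pairs the shared attributes
    with the shared absent attributes, (X*, X#)."""
    return (pos_derivation(objects, context, attributes),
            neg_derivation(objects, context, attributes))
```

Attribute clustering would replace individual attributes with cluster labels before applying these operators, which is exactly where the original and new three-way concepts start to diverge.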
Current mainstream models cannot fully express the semantics of question-and-answer pairs, do not fully consider the relationships between the topic information of question-and-answer pairs, and use activation functions that suffer from soft saturation, all of which affect the overall performance of the models. To solve these problems, an answer selection model based on pooling and feature combination enhanced BERT (Bidirectional Encoder Representations from Transformers) was proposed. Firstly, adversarial samples and a pooling operation were introduced to represent the semantics of question-and-answer pairs on the basis of the pre-trained model BERT. Secondly, the relationships between the topic information of question-and-answer pairs were strengthened by feature combination of the topic information. Finally, the activation function in the hidden layer was improved, and the splicing vector was passed through the hidden layer and classifier to complete the answer selection task. Model validation was performed on the SemEval-2016 CQA and SemEval-2017 CQA datasets. The results show that compared with the tBERT model, the proposed model improves accuracy by 3.1 and 2.2 percentage points respectively, and F1 score by 2.0 and 3.1 percentage points respectively. It can be seen that the comprehensive effect of the proposed model on the answer selection task is effectively improved, and both the accuracy and F1 score of the model are better than those of the comparison model.
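The splicing vector and the improved hidden-layer activation can be illustrated as follows. The exact composition of the splicing vector and the paper's activation are not specified in the abstract, so this sketch assumes a common concatenation layout and a leaky non-saturating activation as a stand-in:

```python
import numpy as np

def splice(u, v):
    """Combine question and answer sentence vectors into one splicing
    vector: concatenation, absolute difference and element-wise product
    (a common composition; the paper's exact layout may differ)."""
    return np.concatenate([u, v, np.abs(u - v), u * v])

def hidden_layer(x, W, b, alpha=0.01):
    """Hidden layer with a leaky, non-saturating activation, one common
    remedy for the soft-saturation problem of tanh/sigmoid."""
    z = W @ x + b
    return np.where(z > 0, z, alpha * z)
```

Unlike tanh, whose gradient vanishes as |z| grows, the leaky activation keeps a nonzero slope everywhere, which is the property the abstract's "soft saturation" complaint is about.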
In Remaining Useful Life (RUL) prediction methods for aero-engines, the original data and the extracted features at different time steps are not weighted simultaneously, which leads to low RUL prediction accuracy. Therefore, an RUL prediction method based on an optimized hybrid model was proposed. Firstly, three different paths were chosen to extract features. 1) The mean value and trend coefficient of the original data were input into a fully connected network. 2) The original data were input into a Bidirectional Long Short-Term Memory (Bi-LSTM) network, and the attention mechanism was used to process the obtained features. 3) The attention mechanism was used to process the original data, and the weighted features were input into a Convolutional Neural Network (CNN) and a Bi-LSTM network. Then, following the idea of fusing multi-path features for prediction, the extracted features were fused and input into a fully connected network to obtain the RUL prediction result. Finally, the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) datasets were used to verify the effectiveness of the method. Experimental results show that the proposed method performs well on all four datasets. Taking the FD001 dataset as an example, the Root Mean Square Error (RMSE) of the proposed method is reduced by 9.01% compared to that of the Bi-LSTM network.
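The features of path 1 above, the mean value and trend coefficient of a sensor window, can be sketched directly. The trend coefficient is assumed here to be the least-squares slope over time steps, which is the usual definition; the paper may define it differently.

```python
import numpy as np

def window_statistics(window):
    """Path 1 features for one sensor window: the mean value and the
    trend coefficient (least-squares slope over the time steps)."""
    t = np.arange(len(window), dtype=float)
    slope = np.polyfit(t, window, deg=1)[0]   # first coefficient = slope
    return float(np.mean(window)), float(slope)
```

A degradation trend shows up as a consistently nonzero slope, which is why this hand-crafted pair complements the learned Bi-LSTM and CNN features in the fusion step.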
Aiming at the problem that the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) lacks vocabulary information, a Chinese named entity recognition model, OpenKG + Entity Enhanced BERT + CRF (Conditional Random Field), based on knowledge-base entity-enhanced BERT was proposed on the basis of the semi-supervised entity-enhanced minimum mean-square error pre-training model. Firstly, documents were downloaded from the Chinese general encyclopedia knowledge base CN-DBpedia and entities were extracted with the Jieba Chinese text segmentation tool to expand the entity dictionary. Then, the entities in the dictionary were embedded into BERT for pre-training, and the word vectors obtained from training were input into a Bidirectional Long Short-Term Memory (BiLSTM) network for feature extraction. Finally, the results were corrected by the CRF layer and output. Model validation was performed on the CLUENER 2020 and MSRA datasets, and the proposed model was compared with the Entity Enhanced BERT pre-training, BERT+BiLSTM, ERNIE and BiLSTM+CRF models. Experimental results show that compared with these four models, the proposed model has its F1 score increased by 1.63 and 1.1 percentage points, 3.93 and 5.35 percentage points, 2.42 and 4.63 percentage points, and 6.79 and 7.55 percentage points, respectively, on the two datasets. It can be seen that the comprehensive effect of the proposed model on named entity recognition is effectively improved, and the F1 scores of the model are better than those of the comparison models.
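The dictionary-lookup step that feeds entities into BERT can be illustrated with a forward-maximum-matching scan. This is a stand-in for the Jieba-based extraction the abstract describes, with a toy dictionary; the real pipeline builds the dictionary from CN-DBpedia documents.

```python
def extract_entities(text, entity_dict, max_len=6):
    """Forward maximum matching against the entity dictionary: at each
    position, greedily take the longest dictionary entry that matches
    (a simplified stand-in for the Jieba-based extraction)."""
    entities, i = [], 0
    while i < len(text):
        for L in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + L]
            if cand in entity_dict:
                entities.append((cand, i))   # (entity, start offset)
                i += L
                break
        else:
            i += 1                           # no entity starts here
    return entities
```

Preferring the longest match matters for nested names: with both "北京" and "北京大学" in the dictionary, the longer entity wins when present.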
Focusing on the issue that embedding attention mechanism modules into a Convolutional Neural Network (CNN) to improve application accuracy increases the parameters and the computational cost, the lightweight Height Dimensional Squeeze and Excitation (HD-SE) module and Width Dimensional Squeeze and Excitation (WD-SE) module based on squeeze and excitation were proposed. To make full use of the potential information in the feature maps, the height- and width-dimensional weight information of the feature maps was extracted by HD-SE and WD-SE respectively through squeeze and excitation operations; the obtained weight information was then applied to the corresponding tensors of the feature maps in the two dimensions to improve the application accuracy of the model. Experiments were implemented on the CIFAR10 and CIFAR100 datasets after embedding HD-SE and WD-SE into the Visual Geometry Group 16 (VGG16), Residual Network 56 (ResNet56), MobileNetV1 and MobileNetV2 models respectively. Experimental results show that HD-SE and WD-SE add fewer parameters and less computational cost to the network models while achieving the same or even better accuracy, compared with state-of-the-art attention mechanism modules such as the Squeeze and Excitation (SE) module, Coordinate Attention (CA) block, Convolutional Block Attention Module (CBAM) and Efficient Channel Attention (ECA) module.
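The height-dimensional squeeze-and-excitation can be sketched in numpy. This is a sketch under assumptions: the squeeze is taken as a mean over the channel and width axes, and the excitation as a two-layer bottleneck with sigmoid gating, mirroring the original SE design; WD-SE would be identical with the width axis in place of the height axis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hd_se(feature_map, w1, w2):
    """Height-dimensional squeeze and excitation.
    feature_map: (C, H, W); w1: (H_r, H) and w2: (H, H_r) form the
    excitation bottleneck with reduction H_r < H."""
    squeezed = feature_map.mean(axis=(0, 2))      # (H,): one value per height
    hidden = np.maximum(0.0, w1 @ squeezed)       # bottleneck + ReLU
    weights = sigmoid(w2 @ hidden)                # (H,) gates in (0, 1)
    return feature_map * weights[None, :, None]   # broadcast over C and W
```

Because the gate vector has length H rather than C, the parameter count scales with the spatial size instead of the (usually much larger) channel count, which is where the claimed lightness comes from.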
With the rapid development of cloud computing technology, the number of data centers has increased significantly, and the resulting energy consumption problem has gradually become one of the research hotspots. Aiming at the problem of server energy consumption optimization, a data center server Energy Consumption Optimization algorithm combining eXtreme Gradient Boosting (XGBoost) and Multi-Gated Recurrent Unit (Multi-GRU), named ECOXG, was proposed. Firstly, data such as resource occupation information and energy consumption of each component of the servers were collected by Linux terminal monitoring commands and power consumption meters, and the data were preprocessed to obtain the resource utilization rates. Secondly, the resource utilization rates were assembled into a time series of vectors, which was used to train the Multi-GRU load prediction model, and simulated frequency reduction was performed on the servers according to the prediction results to obtain the load data after frequency reduction. Thirdly, the resource utilization rates of the servers were combined with the energy consumption data at the same moments to train the XGBoost energy consumption prediction model. Finally, the load data after frequency reduction were input into the trained XGBoost model, and the energy consumption of the servers after frequency reduction was predicted. Experiments on the actual resource utilization data of 6 physical servers showed that the ECOXG algorithm had a Root Mean Square Error (RMSE) reduced by 50.9%, 31.0%, 32.7% and 22.9% compared with the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, CNN-GRU and CNN-LSTM models, respectively. Meanwhile, compared with the LSTM, CNN-GRU and CNN-LSTM models, the ECOXG algorithm saved 43.2%, 47.1% and 59.9% of the training time, respectively.
Experimental results show that the ECOXG algorithm can provide a theoretical basis for the prediction and optimization of server energy consumption, and that it is significantly better than the comparison algorithms in accuracy and operating efficiency. In addition, the power consumption of the servers after the simulated frequency reduction is significantly lower than the real power consumption, and the energy-saving effect is outstanding when the utilization rates of the servers are low.
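Two steps of the ECOXG pipeline lend themselves to a short sketch: assembling the utilization vectors into sliding windows for the Multi-GRU, and the simulated frequency reduction. The scaling rule below (utilization stretches in inverse proportion to the clock ratio, capped at 100%) is a simplifying assumption, not necessarily the paper's exact model.

```python
import numpy as np

def build_windows(utilization, window):
    """Stack per-time-step resource-utilization vectors into sliding
    windows, the series form fed to the Multi-GRU load predictor.
    utilization: (T, n_features) -> (T - window + 1, window, n_features)."""
    T = utilization.shape[0]
    return np.stack([utilization[i:i + window] for i in range(T - window + 1)])

def simulate_frequency_reduction(predicted_load, freq_ratio):
    """Simulated frequency reduction: running at freq_ratio of the base
    clock stretches utilization accordingly, capped at 100%
    (a simplified scaling model)."""
    return np.minimum(predicted_load / freq_ratio, 100.0)
```

The stretched load is what gets fed to the trained XGBoost model to estimate post-reduction energy consumption; the cap explains why the savings are largest at low utilization, where there is headroom to stretch into.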
All market activities of stock market participants combine to affect stock market changes, making stock market volatility fraught with complexity and making accurate prediction of stock prices a challenge. Among the activities that affect stock market changes, financial disclosure is an attractive and potentially financially rewarding means of predicting stock index changes. In order to deal with the complex changes in the stock market, a stock index prediction method was proposed that incorporates data from the financial statements disclosed by corporations. Firstly, the historical stock index data and corporate financial statement data were preprocessed, with the main task being dimension reduction of the high-dimensional matrix generated from the financial statement data; then, a dual-channel Long Short-Term Memory (LSTM) network was used to forecast on the normalized data. Experimental results on the SSE 50 and CSI 300 index datasets show that the prediction effect of the proposed method is better than that of using only historical stock index data.
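The preprocessing the abstract describes, dimension reduction of the financial-statement matrix plus normalization of the series before the LSTM, can be sketched as follows. SVD-based PCA and min-max scaling are assumed here as standard choices; the paper's concrete reduction method may differ.

```python
import numpy as np

def reduce_dimension(statements, k):
    """Project the high-dimensional financial-statement matrix onto its
    top-k principal components via SVD-based PCA (one standard choice)."""
    centered = statements - statements.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T                 # (n_reports, k)

def min_max_normalize(series):
    """Scale a price or feature series into [0, 1] before the LSTM."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)
```

The reduced statement features and the normalized index series would then feed the two channels of the dual-channel LSTM separately.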
Concerning the contradiction between edge preservation and noise suppression in image denoising, a patch-similarity anisotropic diffusion algorithm based on a variable exponent was proposed. The algorithm combined the adaptive Perona-Malik (PM) model based on a variable exponent with the idea of patch similarity, and constructed a new edge indicator and a new diffusion coefficient function. Traditional anisotropic diffusion denoising algorithms detect edges from the intensity similarity (or gradient information) of single pixels and therefore cannot effectively preserve weak edges and details such as texture; by contrast, the proposed algorithm utilizes the intensity similarity of neighboring pixel patches, so it can preserve more detail information while removing noise. The simulation results show that, compared with traditional image denoising algorithms based on Partial Differential Equations (PDEs), the proposed algorithm improves the Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR) to 16.602480 dB and 31.284672 dB respectively, and enhances the anti-noise capability. At the same time, the filtered image preserves more detail features such as weak edges and textures and has good visual effects. Therefore, the algorithm achieves a good balance between noise reduction and edge preservation.
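One diffusion step of a patch-similarity PM scheme can be sketched in numpy. This is a simplified illustration with a fixed-exponent coefficient g = 1/(1 + d/kappa^2) and periodic boundaries; the paper's variable-exponent coefficient and exact edge indicator are not reproduced here.

```python
import numpy as np

def patch_distance(img, shift, radius=1):
    """Mean squared difference between each pixel's patch and the patch of
    its neighbor in the given direction: the patch-similarity edge
    indicator used instead of a single-pixel gradient."""
    moved = np.roll(img, shift, axis=(0, 1))
    diff2 = (img - moved) ** 2
    k = 2 * radius + 1
    out = diff2
    for axis in (0, 1):   # separable box filter averages over the patch
        out = sum(np.roll(out, s, axis=axis) for s in range(-radius, radius + 1)) / k
    return out

def pm_step(img, kappa=10.0, lam=0.2, radius=1):
    """One explicit Perona-Malik diffusion step with a patch-based
    diffusion coefficient (a simplified, fixed-exponent sketch)."""
    total = np.zeros_like(img)
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        grad = np.roll(img, shift, axis=(0, 1)) - img
        g = 1.0 / (1.0 + patch_distance(img, shift, radius) / kappa ** 2)
        total += g * grad
    return img + lam * total
```

Because the coefficient is driven by whole-patch differences, isolated noise barely lowers g (diffusion proceeds), while a coherent edge lowers g along its whole length (diffusion stops), which is the mechanism behind the improved weak-edge preservation.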